Add sublayer compute function and example project for dense #62
Conversation
This looks like a great start. I did a quick check and the results are similar between the sublayer and full-layer computations, but not exactly the same. I guess that you have to store the intermediate values and waste FFs. Two next thoughts:
Regarding the point on pruning, that is worth a try. We could also switch to calculating the number of nonzero multiplications on the fly, like we do for the convolutional layer: https://github.com/hls-fpga-machine-learning/hls4ml/blob/master/nnet_utils/nnet_conv.h#L108-L109
^^^ this. Maybe it's good to develop consistent machinery between conv and MLP?
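For context, here is a minimal sketch of what that on-the-fly accounting could look like transplanted to the dense case. The `CONFIG_T` fields (`n_zeros`, `reuse_factor`) and the exact pragma form are assumptions modeled on the conv code, not a quote of the linked lines:

```cpp
// Sketch only: cap the number of multiplier instances at the number of
// nonzero products, so zero (pruned) weights cost no DSPs.
template<class data_T, class res_T, typename CONFIG_T>
void compute_layer_nz(data_T data[CONFIG_T::n_in], res_T res[CONFIG_T::n_out],
                      typename CONFIG_T::weight_t weights[CONFIG_T::n_in * CONFIG_T::n_out],
                      typename CONFIG_T::bias_t biases[CONFIG_T::n_out]) {
    // Ceiling-divide the nonzero multiplication count by the reuse factor:
    // this is how many multipliers must actually run in parallel.
    const int multiplier_limit =
        (CONFIG_T::n_in * CONFIG_T::n_out - CONFIG_T::n_zeros + CONFIG_T::reuse_factor - 1)
        / CONFIG_T::reuse_factor;
    #pragma HLS ALLOCATION instances=mul limit=multiplier_limit operation

    Accum: for (int jj = 0; jj < CONFIG_T::n_out; jj++) {
        typename CONFIG_T::accum_t acc = biases[jj];
        Mult: for (int ii = 0; ii < CONFIG_T::n_in; ii++) {
            acc += data[ii] * weights[ii * CONFIG_T::n_out + jj];
        }
        res[jj] = (res_T) acc;
    }
}
```

The point of computing the limit from the sparsity rather than hard-coding it is that the same template then serves both pruned and unpruned dense layers, matching the conv-side machinery.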
@nhanvtran we never replied to your idea about doing loops within loops. My feeling is that by doing the separate sublayer calls within a loop, you'll end up with the same problem (i.e. it's going to try to unroll everything). This is why I imagined having the …
You can test this; the model is a big dense model: https://github.com/hls-fpga-machine-learning/keras-training/blob/master/models/models.py#L7
@jmduarte writing the sublayers sequentially also works.
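To illustrate what "sequentially" means here, a rough sketch (the function names, configs, and slice bookkeeping are hypothetical, and `w2`/`b2` stand in for the generated weights headers): each call owns one slice of the layer's outputs, so HLS unrolls each sublayer body on its own instead of one giant loop.

```cpp
// Sketch only: top function with one compute_sublayer call per output
// slice of layer 2. config2_0/config2_1 are assumed to carry the slice
// index (e.g. i_part = 0 and 1).
void myproject(input_t data[N_INPUTS], result_t res[N_OUTPUTS]) {
    layer2_t logits2[N_LAYER_2];
    #pragma HLS ARRAY_PARTITION variable=logits2 complete

    nnet::compute_sublayer<input_t, layer2_t, config2_0>(data, logits2, w2, b2); // first half
    nnet::compute_sublayer<input_t, layer2_t, config2_1>(data, logits2, w2, b2); // second half
    // ... remaining layers as usual ...
}
```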
So it looks like it's working well, but I'm a little concerned about how this looks to the user. Is there a way to "wrap" all the sublayer calls so they're not in the main function of the HLS project? Similarly (and probably a little more importantly), there are so many sublayer configurations that doing any fine-tuning beyond the yaml configuration looks intractable. What do you think?
@nhanvtran, the latest commit addresses your comment about the aesthetics. I think the code looks more straightforward with the many sublayer calls factorized into their own functions at the bottom of the … Take a look (you can run the config referenced above) and let me know if it's ok. Thanks.
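Presumably the factorization looks something like the following (a sketch under assumed names, not the actual commit): the sublayer calls move out of the top function into a per-layer helper defined at the bottom of the file.

```cpp
// Sketch only: per-layer wrapper so myproject() stays one call per layer.
void compute_layer2(input_t data[N_INPUTS], layer2_t logits2[N_LAYER_2]) {
    nnet::compute_sublayer<input_t, layer2_t, config2_0>(data, logits2, w2, b2);
    nnet::compute_sublayer<input_t, layer2_t, config2_1>(data, logits2, w2, b2);
}

// In the top function the whole layer then reduces to:
//   compute_layer2(data, logits2);
```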
tested and will merge so that we can proceed to conv sublayers |
It seems that the problem still exists with the new writer. I am converting a fairly large CNN model like this: … The problem seems to occur in the dense layer. Is there any solution?
I am facing the same issue. I am converting a basic CNN:

```python
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
```

I get the same error. Is there any solution for it?
hi @marina-neseem, this just means you're trying to parallelize/unroll fully, e.g. using `io_parallel`. There is another dataflow scheme in hls4ml called `io_stream`. See https://fastmachinelearning.org/hls4ml/details.html#i-o-types
I tried setting io_type='io_stream', but I get the same error.
This is a PR to fix the memory problem (issue #59) when unrolling large loops.
The idea is to break up the loop by partitioning the output array for each layer call.
This PR only addresses the fully connected layer.
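A minimal sketch of that partitioning, under assumed names (`i_part`, `n_sub`, and the slice arithmetic are illustrative, not the merged code): each sublayer call owns one contiguous slice of the output array, so only that slice's loops get unrolled per call.

```cpp
// Sketch only: sublayer i_part computes outputs
// [i_part * n_part, (i_part + 1) * n_part) of a dense layer.
template<class data_T, class res_T, typename CONFIG_T>
void compute_sublayer(data_T data[CONFIG_T::n_in], res_T res[CONFIG_T::n_out],
                      typename CONFIG_T::weight_t weights[CONFIG_T::n_in * CONFIG_T::n_out],
                      typename CONFIG_T::bias_t biases[CONFIG_T::n_out]) {
    const unsigned n_part = CONFIG_T::n_out / CONFIG_T::n_sub; // outputs per sublayer
    const unsigned start  = CONFIG_T::i_part * n_part;         // first output of this slice

    Part: for (unsigned jj = start; jj < start + n_part; jj++) {
        #pragma HLS UNROLL
        typename CONFIG_T::accum_t acc = biases[jj];
        Mult: for (unsigned ii = 0; ii < CONFIG_T::n_in; ii++) {
            #pragma HLS UNROLL
            acc += data[ii] * weights[ii * CONFIG_T::n_out + jj];
        }
        res[jj] = (res_T) acc;
    }
}
```

Because each function call now contains only `n_out / n_sub * n_in` multiplications, the per-function unroll stays within what the HLS scheduler can handle, which is the memory fix this PR is after.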